Strategy-Based Warm Starting for Regret Minimization in Games
نویسندگان
چکیده
Counterfactual Regret Minimization (CFR) is a popular iterative algorithm for approximating Nash equilibria in imperfect-information multi-step two-player zero-sum games. We introduce the first general, principled method for warm starting CFR. Our approach requires only a strategy for each player, and accomplishes the warm start at the cost of a single traversal of the game tree. The method provably warm starts CFR to as many iterations as it would have taken to reach a strategy profile of the same quality as the input strategies, and does not alter the convergence bounds of the algorithms. Unlike prior approaches to warm starting, ours can be applied in all cases. Our method is agnostic to the origins of the input strategies. For example, they can be based on human domain knowledge, the observed strategy of a strong agent, the solution of a coarser abstraction, or the output of some algorithm that converges rapidly at first but slowly as it gets closer to an equilibrium. Experiments demonstrate that one can improve overall convergence in a game by first running CFR on a smaller, coarser abstraction of the game and then using the strategy in the abstract game to warm start CFR in the full game.
منابع مشابه
Regret Transfer and Parameter Optimization
Regret matching is a widely-used algorithm for learning how to act. We begin by proving that regrets on actions in one setting (game) can be transferred to warm start the regrets for solving a different setting with same structure but different payoffs that can be written as a function of parameters. We prove how this can be done by carefully discounting the prior regrets. This provides, to our...
متن کاملRegret Minimization in Non-Zero-Sum Games with Applications to Building Champion Multiplayer Computer Poker Agents
In two-player zero-sum games, if both players minimize their average external regret, then the average of the strategy profiles converges to a Nash equilibrium. For n-player general-sum games, however, theoretical guarantees for regret minimization are less understood. Nonetheless, Counterfactual Regret Minimization (CFR), a popular regret minimization algorithm for extensiveform games, has gen...
متن کاملRegret Minimization in Games with Incomplete Information
Extensive games are a powerful model of multiagent decision-making scenarioswith incomplete information. Finding a Nash equilibrium for very large instancesof these games has received a great deal of recent attention. In this paper, wedescribe a new technique for solving large games based on regret minimization.In particular, we introduce the notion of counterfactual regret, whi...
متن کاملAdaptive Regret Minimization in Bounded-Memory Games
Online learning algorithms that minimize regret provide strong guarantees in situations that involve repeatedly making decisions in an uncertain environment, e.g. a driver deciding what route to drive to work every day. While regret minimization has been extensively studied in repeated games, we study regret minimization for a richer class of games called bounded memory games. In each round of ...
متن کاملEfficient Regret Minimization in Non-Convex Games
We consider regret minimization in repeated games with non-convex loss functions. Minimizing the standard notion of regret is computationally intractable. Thus, we define a natural notion of regret which permits efficient optimization and generalizes offline guarantees for convergence to an approximate local optimum. We give gradient-based methods that achieve optimal regret, which in turn guar...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016